21 research outputs found

    HCM: Hardware-Aware Complexity Metric for Neural Network Architectures

    Full text link
    Convolutional Neural Networks (CNNs) have become common in many fields including computer vision, speech recognition, and natural language processing. Although CNN hardware accelerators are already included as part of many SoC architectures, the task of achieving high accuracy on resource-restricted devices is still considered challenging, mainly due to the vast number of design parameters that need to be balanced to achieve an efficient solution. Quantization techniques, when applied to the network parameters, lead to a reduction of power and area and may also change the ratio between communication and computation. As a result, some algorithmic solutions may suffer from lack of memory bandwidth or computational resources and fail to achieve the expected performance due to hardware constraints. Thus, the system designer and the micro-architect need to understand at early development stages the impact of their high-level decisions (e.g., the architecture of the CNN and the amount of bits used to represent its parameters) on the final product (e.g., the expected power saving, area, and accuracy). Unfortunately, existing tools fall short of supporting such decisions. This paper introduces a hardware-aware complexity metric that aims to assist the system designer of the neural network architectures, through the entire project lifetime (especially at its early stages) by predicting the impact of architectural and micro-architectural decisions on the final product. We demonstrate how the proposed metric can help evaluate different design alternatives of neural network models on resource-restricted devices such as real-time embedded systems, and to avoid making design mistakes at early stages

    Using Value Prediction to Increase the Power of Speculative Execution Hardware

    No full text
    This paper presents an experimental and analytical study of value prediction and its impact on speculative execution in superscalar microprocessors. Value prediction is a new paradigm that suggests predicting outcome values of operations (at run-time) and using these predicted values to trigger the execution of true-data dependent operations speculatively. As a result, stalls to memory locations can be reduced and the amount of instruction-level parallelism can be extended beyond the limits of the program’s dataflow graph. This paper examines the characteristics of the value prediction concept from two perspectives: 1. the related phenomena that are reflected in the nature of computer programs, and 2. the significance of these phenomena to boosting instruction-level parallelism of super-scalar microprocessors that support speculative execution. In order to better understand these characteristics, our work combines both analytical and experimental studies.

    Electromigration-Aware Memory Hierarchy Architecture

    No full text
    New mission-critical applications, such as autonomous vehicles and life-support systems, set a high bar for the reliability of modern microprocessors that operate in highly challenging conditions. However, while cutting-edge integrated circuit (IC) technologies have intensified microprocessors by providing remarkable reductions in the silicon area and power consumption, they also introduce new reliability challenges through the complex design rules they impose, creating a significant hurdle in the design process. In this paper, we focus on electromigration (EM), which is a crucial factor impacting IC reliability. EM refers to the degradation process of IC metal nets when used for both power supply and interconnecting signals. Typically, EM concerns have been addressed at the backend, circuit, and layout levels, where EM rules are enforced assuming extreme conditions to identify and resolve violations. This study presents new techniques that leverage architectural features to mitigate the effect of EM on the memory hierarchy of modern microprocessors. Architectural approaches can reduce the complexity of solving EM-related violations, and they can also complement and enhance common existing methods. In this study, we present a comprehensive simulation analysis that demonstrates how the proposed solution can significantly extend the lifetime of a microprocessor’s memory hierarchy with minimal overhead in terms of performance, power, and area while relaxing EM design efforts

    Electromigration-Aware Architecture for Modern Microprocessors

    No full text
    Reliability is a fundamental requirement in microprocessors that guarantees correct execution over their lifetimes. The reliability-related design rules depend on the process technology and device operating conditions. To meet reliability requirements, advanced process nodes impose challenging design rules, which place a major burden on the VLSI implementation flow because they impose severe physical constraints. This paper focuses on electromigration (EM), one of the critical factors affecting semiconductor reliability. EM is the aging process of on-die wires in integrated circuits (ICs). Traditionally, EM issues have been handled at the physical design level, which enforces reliability rules using worst-case scenario analysis to detect and solve violations. In this paper, we offer solutions that exploit architectural characteristics to reduce EM impact. The use of architectural methods can simplify EM solutions, and such methods can be incorporated with standard physical-design-based solutions to enhance current methods. Our comprehensive physical simulation results show that, with minimal area, power, and performance overhead, the proposed solution can relax EM design efforts and significantly extend a microprocessor’s lifetime

    The Effect of Instruction Fetch Bandwidth on Value Prediction

    No full text
    Value prediction attempts to eliminate true-data dependencies by dynamically predicting the outcome values of instructions and executing true-data dependent instructions based on that prediction. In this paper we attempt to understand the limitations of using this paradigm in realistic machines. We show that the instruction-fetch bandwidth and the issue rate have a very significant impact on the efficiency of value prediction. In addition, we study how recent techniques to improve the instruction-fetch rate affect the efficiency of value prediction and its hardware organization. 1. Introduction The fast growing density of gates on a silicon die, allows modern microprocessors to increasingly employ multiple execution units that are capable of executing several instructions in parallel. Most of the recent microprocessor architectures assume sequential programs as an input and a parallel execution model, where the hardware is expected to extract the parallelism at run-time out of the ins..
    corecore